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Abstract. We present and analyze a class of nonsymmetric preconditioners within a 
normal (weighted least-squares) matrix form for use in GMRES to solve nonsymmetric 
matrix problems that typically arise in finite element discretizations. An example of the 
additive Schwarz method applied to nonsymmetric but definite matrices is presented for 
bJQf which the abstract assumptions are verified. A variable preconditioner, combining the 

original nonsymmetric one and a weighted least-squares version of it, is shown to be con- 
vergent and provides a viable strategy for using nonsymmetric preconditioners in practice. 
Numerical results are included to assess the theory and the performance of the proposed 
preconditioners. 



I. Introduction 

The numerical approximation of most phenomena in science and technology requires the 
solution of linear or nonlinear algebraic systems. Preconditioning is one of the main tech- 
niques that combined with a proper iterative method allows for reducing substantially the 
cost of solving those systems. Many efforts are usually devoted to design proper precondi- 
tioning strategies that allow for efficient and fast solution of the resulting algebraic systems 
[19| 120] . The development of preconditioners is very often guided by the properties of the 
underlying problem and it sometimes can even dictate particular aspects that should be 
accounted for, when devising the numerical discretization of the continuous problem (as 
for instance in [121 H] ) • 

Even for linear problems, the design and analysis of preconditioners for the linear systems 
is far from being complete. For symmetric and coercive problems, a reasonable discretiza- 
tion yields a linear system Aox = b with Aq symmetric and positive definite (s.p.d). In 
such it is well known, the spectral information of the matrix itself dictates com- 

pletely the convergence of the method. Therefore a preconditioner Bq would be uniform 
(and could be turned into an optimal preconditioner) if it captures completely such spectral 
information; in other words, if B is spectral equivalent to A . 
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However, for linear systems Ax. = b, with nonsymmetric coefficient matrix A, the design 
of effective preconditioners does not admit a general recipe, at least at the present time. 
Likewise, there is no general iterative solver, and furthermore, there is no general theory 
that could be used to explain the success of a particular preconditioner when it is indeed 
efficient. In most cases, the spectral information does not provide significant information 
that could guide the development of any good preconditioner. Field of values have shown 
some utility in certain circumstance, but have also shown many limitations jTOj [HI [16] . At 
the moment it might seem that each particular problem has to be studied separately and a 
problem-dependent, discretization-dependent preconditioning strategy had to be devised. 
Besides, even when such preconditioning can be designed, its understanding and analysis 
are tasks that in most cases are out of reach. 

In this paper, we focus on a particular situation, where A is nonsymmetric but still 
positive definite. The motivation and application comes from a nonsymmetric Discontinu- 
ous Galerkin discretization of an elliptic problem [8] . In [2J [3] , additive and multiplicative 
Schwarz preconditioners were developed for the solution of the resulting algebraic system. 
In both works, the authors show that the GMRES convergence theory cannot be applied 
for explaining the convergence since the preconditioned system does not satisfy the suffi- 
cient conditions required by such theory. However, such discretizations are used in practice 
and have already shown to have some advantages when approximating advection-diffusion 
problems [HJ [5] and more recently, in the design of methods for some more complex non- 
linear problems [T71 [T5] . In [6], the authors introduce a solver methodology based on the 
idea of subspace correction for this type of discretizations for elliptic problems, provid- 
ing the analysis of the resulting iterative methods without using any GMRES theory. In 
this paper we want to examine, in a more general algebraic abstract framework (that in 
particular will apply to the type of methods discussed above), the issue of providing some 
convergence theory for a preconditioner based on the classical (but nonsymmetric) Schwarz 
preconditioner to be used within GMRES. The ultimate goal is to obtain some insight on 
how to improve and tune the preconditioner. 

Here, in a first stage we consider two preconditioners for A; a classical additive Schwarz 
preconditioner B, which is nonsymmetric, and a symmetric preconditioner Z that basically 
uses actions of the additive Schwarz preconditioner B and its adjoint. Both will be shown 
to have their pros and cons. For the former, the non-symmetry of B and of B~ X A, pre- 
cludes developing any theory from which to extract either some a-priori information on the 
convergence or to provide some guidelines on how the preconditioner could be improved 
or even be designed. The latter, while allowing for developing a convergence theory, will 
be shown to be not the most efficient possible option, although the a-priori information on 
convergence could be of special value depending on the application. Other symmetrizing 
strategies for classical Schwarz methods, different from the one introduced here, have been 
already considered in literature by other authors pL3] . 

At first sight, the underlying message that one might obtain from the analysis of this 
first part, is that enforcing the symmetry of the preconditioner for a nonsymmetric matrix 
might result in the very end, in a wasted effort. We believe this might be the case in many 
situations, and we also think it is relevant and important to point it out. At the same time, 
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we do believe that the results obtained for analyzing the preconditioner, are of independent 
interest (also because of its simplicity), and might provide some basis (as it had happened 
already here) or insights for further development of solvers for nonsymmetric systems. 

In a second stage of the present paper, we introduce a variable preconditioner B that 
is constructed by considering a linear combination of two given (general) preconditioners 
B and Z so that in a sense it tries to integrate and exploit the best of each of them. We 
describe the construction of this variable preconditioner to be used in GMRES, explaining 
how the coefficients in its definition are determined at each iteration inside GMRES. We 
show that from the construction of B we immediately can deduce (theoretically) a conver- 
gence estimate that guarantees better performance of the resulting solver. In particular, we 
show that the new preconditioner outperforms the symmetric preconditioner Z and indeed 
always converges faster. 

The theory is illustrated with extensive numerical experiments, in which we also study 
the performance of all the considered preconditioners. They are all implemented in parallel 
to fully take advantage of having considered preconditioners based on additive Schwarz 
methods. In the numerical tests, we do observe that the combined preconditioner requires 
less GMRES-iterations to achieve convergence than the the classical additive Schwarz B. 
However, in this particular case, each iteration for the combined preconditioner is more 
costly, which in the end, makes B perform slightly better in terms of execution time. From 
these observations, it might be inferred that the new combined preconditioner B might 
be more competitive in settings where each iteration is expensive, so that the savings in 
iteration count can make up for the high cost per iteration. 

Although we have focused on the nonsymmetric but positive definite case, we believe 
the ideas presented in the paper might be useful and possibly extended to more complex 
problem, including the indefinite case. This issue will be subject of future research. 

The outline of the paper is as follows. Section [2] contains a description of the problem 
and the original motivation of it. In Section [3J we construct the preconditioner Z and 
present the convergence analysis. The combined preconditioner is introduced and analyzed 
in Section HI Finally in Section El we consider a particular application and we provide 
numerical experiments that verify the developed theory and assess the performance of the 
preconditioner. 



with A being non-symmetric but definite and n is assumed to be large. For the applications 
we have in mind, A comes from a finite element discretization of some partial differential 
operator and therefore is sparse and structured. 

We also assume that A is ill-conditioned and that therefore a good preconditioner B is 
required to solve efficiently system ( 12.1 j) by an iterative method. A simple option is to 
construct such B as the classical additive Schwarz preconditioner coming from A. More 



2. Problem formulation 




We are interested in preconditioning a given system of linear equations 
.1) Ax = b, Ae R nxn x, b G W l , 
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precisely, we denote by k = 1, . . . , m, a set of rectangular matrices, such that 1^ 
extends a local vector to a global vector Ik^k with zero entries outside its index set. 
Also, let I c = P be an interpolation matrix that maps a coarse vector v = v c to a 
global vector J c v c = Pv c . Then, the additive Schwarz preconditioner exploits the local 
matrices = fj^AIk, principal submatrices of A, and the coarse matrix A c defined as 
A c = I^AIc = P T AP. The inverse of the additive Schwarz preconditioner B takes the 
following familiar form: 



Obviously, since A is nonsymmetric the resulting additive Schwarz preconditioner B is 
also nonsymmetric. Therefore, for the solution of the resulting preconditioned system 
B~ 1 Ax = B~ 1 b, one has to use any of the iterative methods for nonsymmetric systems, 
such as the Generalized Minimal Residual (GMRES). For analyzing the convergence of the 
resulting iterative method (for the preconditioned system) one has to resort to one of the 
available and non-optimal GMRES theories. In the Domain Decomposition framework, 
the GMRES convergence theory of Eisenstat et. al. [5] is generally used. In particular, 
to derive (a-priori) any conclusion on the performance of the preconditioner B, this theory 
requires some control on the coercivity of B~ l A (in some inner product). Therefore, at 
least in theory, using B directly as a preconditioner for A might not be successful. 

Still, we would like to utilize the actions of B^ 1 to define a preconditioner, say Z, for 
A, for which some bounds on the rate of convergence can be a-priori determined. In the 
next section, we show how such a preconditioner Z can be constructed (and analyzed) by 
exploiting the fact that although A is nonsymmetric, it is positive definite in some inner 
product. We also compare numerically, in Section \5[ the performance of the constructed 
preconditioner Z with the original nonsymmetric additive Schwarz B. As we will show, 
even if a theory can be developed for Z it might not be the most efficient option. 

We now state our basic assumption regarding the matrix A. More specifically, we assume 
that there is an s.p.d. matrix Aq such that A and Aq are related by the following basic 
assumption: 

Assumption (HO): Let A E IR nxn be nonsymmetric but definite and let A e M. nxn be 
s.p.d. We say that the pair of matrices (A, A ) satisfy Assumption (HO) with constants 
(c , Ci) if they do satisfy the following coercivity and boundedness estimates: 



(2.2) 




k=l 



(2.3) 



v Av > Co v Aqv for all v, 



(2.4) 



w T Av < c\ \J v T v4 v \J w T A w for all v, w. 



3. An abstract result 



In this section we present the construction and give the analysis of a preconditioner for 
A that basically only uses the actions of the additive Schwarz method. We start proving a 
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couple of Lemmas that will be required for our subsequent analysis and derivation. 

The next Lemma shows that for any pair of matrices (A, A ) satisfying (HO) with con- 
stants (co,ci), the corresponding pair (A -1 , Aq 1 ) (consisting of their inverses) also satisfies 
(HO) with constants (03,04) that depend only on c and c±. 

Lemma 3.1. Let A G M nxn be nonsymmetric but definite and let A G M. nxn be s.p.d 
Let (A, A ) be a pair of matrices that satisfy assumption (HO) with constants (c ,Ci) (in 
particular, A G M. nxn is nonsymmetric but definite and Aq G M. nxn is s.p.d). Then, the 

pair (A" 1 ,/!^ 1 ) also satisfies assumption (HO) with constants ^,CqM; that is, 
(3.1) v t A~ 1 y > % v t Aq 1 v, for all v G E re , 
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(3.2) w r A -1 v < — V v t Aq 1v V wT ^o V f° r al1 v ; w e R " 

_ 1 _1 

Proof. We first show the boundedness estimate (13.21) . We define the matrix Y := A 2 AA 2 . 
Then, ( 12. 3 p and (12. 4p imply (or read) that Y satisfy: 

(3.3) v T Fv > c ||v|| 2 , forallvGM n , 

(3.4) w T Fv < ci||v||||w||, for allv,w G R n . 

The positivity (13. 3p of Y guarantees the existence of Y~ x and so taking v := y -1 w in ( 13. 3p . 
and using the symmetry of the standard £ 2 -inner product of two vectors together with the 
Cauchy Schwarz inequality, we find 

c ||y _1 w|| 2 < w T y~ T w = w T y _1 w < ||w||||y _1 w||, 

which shows that ||y _1 w|| < ^ ||w||, that is, the boundedness of Y~ x in the £ 2 -norm: 

(3.5) \\Y^\\ < -. 
In other words we have shown that 

w T A|A -1 A|v = w T y -1 v < — || v || || w || Vv,w G K n . 

Co 

_1 _1 

Setting now in the above equation v := A 2 v and w := A 2 w, we reach the desired 
boundedness estimate ( 13. 2 p for A' 1 in terms of Aq . 

The positivity estimate (13.11) can be shown as follows. On the one hand, the boundedness 
(13. 4p of y with v = v and w = Y~ l \ gives 

II v || 2 = v t y(y -1 v) < ci \\y-\\\ ||v|| Vver. 

which readily implies 

(3.6) ||y _1 v|| > — II v II VveR n . 

Cl 

On the other hand, using the positivity estimate (13. 3 p of Y, we have 

v T y-V = (y- 1 v) r y T (y^ 1 v) = (y~\) t y(y- 1 v) > c \\y~\\\ 2 Vv g w 1 . 
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Then, the above relation together with estimate (I3.6P give the following positivity estimate 
for Y" 1 : 

v t y~\> % llvll 2 Vvel". 

_i 

Now, setting in the above estimate v := A 2 v, we obtain the coercivity relation ( 13. ip and 
conclude the proof. □ 

Next, let Bq denote the s.p.d. additive Schwarz preconditioner of Aq, whose inverse is 
defined as: 

(3.7) B-^PA^P^ + f^hA^ 1 ^. 

k=l 

Note that since (A, Aq) satisfy assumption (HO) with constants (co, ci), this immediately 
implies that for each k = 1, . . . N s the family of pairs (Ah, A k °^) with matrices defined by 

A k :=l T k AI k and A® := I^A I k k = l,...N a , 

also satisfy assumption (HO) with the same constants. The same is also true for the coarse 
pair of matrices (A c , A^), where A c = P T AP and A^ = P T A P. Then, applying Lemma 
13.11 to each of these pairs, we have that the corresponding pair of their respective inverses 

and hence the pair with the product matrices (i^A^I^ , I k A k ^ Iff) satisfies (HO) with 

constants (coc^ ,Cq~ ) (i.e, ( 13. ip and (13. 2p ). The latter implies that the inverses of the 
additive Schwarz preconditioners B" 1 (as defined in (12.21) ) and B^ 1 (as defined in ( 13. 7p ). 

are related in the same way (as their individual terms I k A^ l I k and IkA^ T£). That 
is: (B -1 , Bq 1 ) also satisfy (HO) with constants (cqC^ 2 , Cq 1 ) . Applying once more Lemma 
13.1^ we straightaway deduce that the pair (B,B ) also satisfy (HO), now with constants 
( C C 1 ) c i c o )• 

Now, since B is the classical s.p.d. additive Schwarz preconditioner for the s.p.d A , B 
and A can be shown to be spectrally equivalent: there exists 70,71 > such that 

(3.8) 1qv t Bq\ < \ t Aq\ < 7iV T J B v V v 6 1™ , 

where the constants 70 and 71 might depend on the parameters of the discretization and the 
problem. B would be optimal if neither 70 nor 71 depend on the discretization parameters 
(or size of the system n). 

Using this extra information, it is straightforward to deduce that the pair (B, Aq) also 
satisfy (HO) with constants (Po, Pi) that depend only on c ,ci,7o and 71. All these obser- 
vations are summarized in the following Lemma: 

Lemma 3.2. Let (A,Aq) satisfy assumption (HO) with constants (cq,ci). Let B be the 
additive Schwarz preconditioner of A (defined through ( 12. 2 j) ) and let Bq be the corresponding 
s.p.d additive Schwarz preconditioner of Aq (defined through (13. 7p ) and assume Bq is such 
that 

7 v T J B v < v T A v < 7 1 v T J B v, V v G W 1 , 
for some 70,71 > 0. Then, the pair (B,Aq) also satisfies (HO) with constants (Po,Pi): 

(3.9) v r i9v > Pq v T A v, for all v, 
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(3.10) w T Bv < ft y/v T A Q vy/w T A w, /or a// v, w. 

The constants /3q ari^ /?i are given by 



(3.11) 0o = Px = — . 

q7i c 7o 

M^e noie t/iat t/ie same results, fl3.9p -( l3~T0|) . hold for B replaced with B T . 

With all this relations at hand, we define the s.p.d. matrix 

(3.12) Z:=BA l B T . 

that can be used as a preconditioner for A in GMRES. Observe that the actions of Z~ l 
involve actions of both B" 1 and B~ T as well as multiplications with A (not Aq 1 ). There- 
fore, the preconditioner Z is computationally feasible. 

We next prove the main result of the section, which guarantees that the preconditioned 
GMRES method for A with the s.p.d. preconditioner Z = BAq X B t will be convergent 
with bounds depending only on the constants involved in relations between A and Z. 

Theorem 3.1. Let (A, A ) satisfy assumption (HO) with constants (cq,Ci) and let B £ 
jgmxn ^ e ^ g a ^m ve Schwarz preconditioner for A, whose inverse is defined through (12. 2p . 
Let Z := BA^B 7 be a preconditioner for A. Then, the pair (A, Z) also satisfies (HO) 
with constants (ao,«i) defined in (I3.14p . 

Furthermore, the preconditioned GMRES method for A with the s.p.d. preconditioner Z 
converges with bounds: 



||r m ||z-i = ||r m |U < ( 1 - -^=T) \\ r o\\z = f 1 _ "^7= ) IMIz- 1 



where f m = Z~ x v m = Z^ 1 (h — Ax m ) is the preconditioned residual at the m-th iteration 
with r = Z~ x r := Z~ x (h — Ax ); \\ ■ \\z and || ■ ||^-i are the inner-product norms induced 
by the s.p.d matrices Z and Z~ x , respectively. 

Proof. From Lemma [3.21 we know that (B,A ) satisfy (HO) with (/3 ,/3i). In particular, 

_i _i 

the relations (I3.9p - (l3.10p (used for B T ) show that X := A Q 2 B T A 2 is well-conditioned. 
More precisely, we have 

A) || v || 2 < ||Xvf < /3 a || v || 2 for all v £ R n . 

That is, the s.p.d. matrix X T X is well-conditioned. The coercivity of A in terms of Aq 

_ i _ i 

expressed in (12. 3p . and X T X being well-conditioned (or, bounded) imply that A Q 2 AA 2 
is coercive also in terms of X T X: 

v T A~^AA K > Co||v|| 2 > % T I T Iv for all v £ R n . 

Pi 

i i 

Hence, A is coercive in terms of AqX t XAq = BA^B 7 = Z, which is the first desired 
result. 
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Similarly, the boundedness of A in terms of A , expressed in ( 12. 4p . and X T X being 

_ 1 _ i 

well-conditioned (or coercive) imply that A Q 2 AA 2 is bounded also in terms of X T X: 

w T A~^AA K < ci||w||||v|| < ^-Vw T X T Xw Vv T X T Xv, for all v,w e R n , 

Po 

i i 

which is equivalent to say that A is bounded in terms of AqX t XAq = BAq 1 ^! 7 = Z. This 
completes the proof that the pair (A, Z) verifies assumption (HO) with constants («o, «i) 
defined by: 

(3.14) a = — = -=■ • 7o oil = — - 



A standard application of the GMRES convergence theory jH] gives (13.131) . and the proof 
of the theorem is complete. □ 



4. A COMBINED PRECONDITIONER 

In this section, we introduce another preconditioner which in a sense combines the best 
of both preconditioners B and Z = BA ~ 1 B T . We define its inverse by forming the 
linear combination 

ZT 1 = B~ l + aZ- 1 . 

The parameter a G R is allowed to change from iteration to iteration inside the GMRES 
iterative solver. Therefore, B can be regarded as a variable-step, flexible, preconditioner. 

Observe that for a > 0, by virtue of the analysis of the previous section, the pair 
((6 -1 ,Aq 1 ) verifies assumption (HO); i.e, B~ x is coercive and bounded in Aq 1 norm. We 
now describe the (practical) construction of the preconditioner B -1 , but considering a more 
general form: 

(4.1) B- 1 = aB- 1 + aZ- 1 

without assuming the coefficients a and a to have nonnegative sign. At the end of the 
section we provide the convergence result for B^ 1 which asserts faster convergence within 
GMRES than the one obtained with the preconditioner Z^ 1 . 

4.1. Construction of the variable preconditioner. We consider the system of equa- 
tions (12.11) that we solve by the preconditioned GMRES method with preconditioner B~ x 
as defined in (14. ip . We now explain how the coefficients are a and a set inside the GMRES 
iteration. Let || - 1| = \J (•, •) and || • H* = a/ (., .)* be two inner product norms, to be specified 
and chosen later on, and whose role will become clear in the process. 

For m > 0, we denote by x m the m^-iterate and by r m = b — Ax m the residual. At 
the (m + l) th iteration of GMRES, we construct the new search direction d m+ i based not 
only on the previous search directions {d^ }^ but also on the two preconditioned residuals 
B~ 1 r m and Z~ 1 r m , as follows: 

m 
j=0 
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Here, the coefficients (3j, j — 0,1, . . . , m are chosen such that 

(d m +i,dj)» = for j < m + 1 and ||d m +i||* = ^(d m+1 , d m+ i)* = 1. 

It is clear then, that the coefficients /3 m +j, s — 1,2 can be considered arbitrary parameters 
at this point. For any such fixed pair in GMRES, the next iterate x m+ i is then computed 
by minimizing the residual: 

m+l 

||b — Ax m+ i|| = ||b — A(x m + CKjdj) || h> min 

3=0 

with respect to the coefficients {c^}!^ 1 . Notice that out of the two coefficients (3 m+ ±, 
s = 1,2, only their ratio 

a = a m+1 = 



71+ | 



can be considered as a free parameter (the rest is compensated by the a m+ i-coefficient). 

In practice, we proceed as follows. At step m + l, based on the previous search direc- 
tions {dj}™ =0 and the preconditioned residuals i?~ 1 r m and Z~ 1 r m , we have to solve the 
minimization problem: 

m 

||b - A(x m + ^ a J d J + ®m+\ B ~ lr rn + « m+ |^^r m ) || h+ min , 

3=0 

with respect to the coefficients {ay}^ , and a m+ p s — 1,2. As we show next the solution 
of such minimization problem can be reduced to the solution of a 2 x 2 system, by choosing 
appropriately the inner product (-,-)*. 
Consider the quadratic functional J(at) 

m 

J(at) = ||r m - A(Y^ j atjdj + a m+ iB~ x r m + a m+ 2 Z~ l r m ) || 2 (->■ min. 

3=0 

as a function of the coefficients a = (a r ). Set the inner product (•, •)* = (A(-), -A(-)), which 
is equivalent to assuming that the search directions {dj} are (A(.), A(.)) orthogonal. Then, 
we have 

1 dj 

= -t^- = olj - (r m - a m+ iAB~ 1 r m - a m+ ^AZ~ v m , Adj), j < m, 
which gives 

(4.2) atj = (r m , Adj) - a m+ i_(AB~ l r m , Adj) - a m+ 2(AZ~ l r m , Adj), j < m. 

Setting now the partial derivatives of J w.r.t. a m+ ± to zero, we get 



a 



^iHAB^rJI 2 - (r m - A(£ o^-d,- + a m+ 2Z~ 1 r m , AB^r m ) = 0, 



3=0 



(4.3) 

a m+ 2\\AZ l r m \\ 2 - (r m - A(Y, ctjdj + a m+ iB l r m ), AZ l r m ) 

3=0 
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Substituting atj from ( 14. 2 p into ( 14. 3 p we end up with a system of two equations where the 

only two unknowns are the coefficients a m+ s. with s = 1,2: 

(4.4) 



l+ i [ \\AB- 1 r m \\ 2 -Y l {AB- 1 T m , Ad,] 



+a m+ 2_ (AZ- l v m , AB-hn) -^{AZ-h^ Ad j )(AB~ 1 v m , Adj] 



(r m , AB^rJ - J2(r m , Adj)(AB~ 1 r m , Adj) = (r m , AB _1 r m - ^(AB^, Ad J Ad 

j 

a m+ i \{AZ- l r m , AB- x r m )-Y,{AB- l r m , AdA{AZ-h m , Ad 



j 



+a m+2 _ WAZ-^f -^{AZ-^ Ad,-) 2 

3 V j 

= (r m , AZ- 1 ^) -£(r m , Ad^AZ-h^ Ad A = (r roj AZ- 1 x m -Y J {AZ- x x m , AdAAdA. 

j j 

To show the solvability of the above system for a m+ z with s = 1, 2 (which will imply that 
the preconditioner B~ x is well defined), we use the following lemma. 

Lemma 4.1. Let (H, (., .)) be a Hilbert space with inner product (•, •) and let h, f , g £ H . 

Let S be a finite dimensional subspace of H spanned by an orthonormal system {pjYfL l; 
i.e., (pi, Pj) = 5ij. Let tt — tts ■ H — > S be the orthogonal projection on S , with respect to 
the inner product (-, •). Then, the best approximation to h from elements from S augmented 
by the two vectors f and g is given as the solution of the least squares ( or minimization ) 
problem 

(4.5) min ||h — «i f — aag — a jPj II ^ mrn 



. 1 2 



1 m 



"3>3> 

1 2 



over the coefficients {a r }, r — 3,3, 1, . . . , m. Solving problem (14. 5p is equivalent to solve 
the following two-by-two system 



(4.6) 



||(/-7T)f|| 2 ((/-7T)f, (/ 

((J-vr)f, (J-Tr)g) ||(/-7r)g 



7T 



2 





Oil 




)■ 


3 






(X2 






L 3 - 





(h, (/ - 7T)f) 
(h, (/ - 7T)g) 



which has a unique solution provided (I — 7r)f and (J — 7r)g are linearly independent. If 
(I — 7r)f and (/ — 7r)g are linearly dependent, there is also a solution, since the r.h.s. in 
(14. 6 p is compatible. 

The remaining coefficients {a,} are computed from 7r(h — oaf — «2g) = £ a^Pj, that is: 

3 3 j 

OLj = (h — aif — «|g, Pj)- 

Proof. It is clear that the least-squares problem ( 14.51) reduces to finding the best approx- 
imation to (/ — 7r)h from the space spanned by the two vectors (J — 7r)f and (/ — 7r)g. 
Indeed, we can rewrite (14.51) as 



[I — iv)h — ai (I — 7r)f — «2 (/ — 7r)g — a 'jPj II ^ 



mm . 
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Since the last component Yl a 'jPj is orthogonal to (/ — 7r) ( h — aif — a2g], the above 

j V 3 3 / 

minimum equals 

min min || (/ — 7r)h — ai (I — 7r)f — a 2 (I — 7r)g — ^ &'jPj \\ 

a 1 , a 2 „' 3 3 a 

B~ 3 j ■> 



min min — 7r)h — qi (J — 7r)f — «2 (/ — 7r)g|| 2 + || a 'j~Pj\ 

«i,«2 „' 1 3 3 a 

3 3 "i \ J 



2 



min || (/ — 7r)h — ai(I — 7r)f — a? (/ — 7r)g| 

! 1 j 2 3 3 

min ( ||h — ai(I — 7r)f — «2 (/ — 7r)g|| 2 — ||7rh|| : 

1.(12 \ 3 3 



a l , a 2 
3 3 



The last problem leads exactly to the Gram system (14.61) . This completes the proof. □ 

We now apply last Lemma to our case, to show that the system (I4.4p has a solution and 
hence B~ x is well defined. We set h = A~ l v m) f = B~ x r m , g = Z~~ 1 r m and {pj} = {dj}™ =0 
for the vector space with inner-product (■, •)* = (A(-), A(-)). Using then that the {d,,} are 
(•, ■ ) ^-orthonormal, we conclude by applying Lemma [4. II that the system (I4.4p is solvable. 

Once the coefficients a m+ |, s = 1, 2 are determined, we compute the new direction d m+1 
from 

m 

An+id m+ i = a m+ iB 1 r m + a m+ 2 Z~ x v m — ^ Pjdj, 

j=0 

by choosing the coefficients {/3j}^i to satisfy the required orthogonality conditions 
(d m+ i, dj)* = (Ad m+ x, Adj) = for j < m + 1, 

which gives (assuming by induction that (dj, d^)* = Sj^) 

Pj = (« ra+ i 5 " lr m + a m+ |2 _1 r m , dj)* for j < m. 

The last coefficient, /3 m +i, is computed so that ||d m+ i||* = 1. 

4.2. Convergence. We close the section, by giving a result that provides an estimate for 
the convergence of the variable preconditioned GMRES. 

Theorem 4.1. Let (A, A ) satisfy assumption (HO) with constants (co,Ci) and let B £ 
jgmxn ^ e a ^^ni ve Schwarz preconditioner for A, whose inverse is defined through (12. 2p . 
Let Z := BAq 1 B t be a preconditioner for A. Let B be the variable preconditioner with 
inverse defined through (14. ip with coefficients determined inside the GMRES iteration by 
minimization of the residual. Then, the variable preconditioned GMRES method for A 
converges faster than the preconditioned GMRES method with preconditioner Z . 

Proof. The proof of the Theorem follows by the definition of B~ x (as explained before). 
From its construction it is straightforward to infer the following comparative convergence 
estimate 

||r m+ i|| < min ||r m - A(aB~ l + aZ~ l )T m \\ < min ||r m - aAZ~ l v m \\. 

a. a a 
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By choosing now || • || as the norm ||v||^-i = -\/v T Z _1 v, the combined preconditioned 
GMRES method will converge faster than the corresponding GMRES with preconditioner 
Z (that satisfies estimate (13 . 13[) as provided in Theorem 13. ip . □ 

5. Applications and numerical results 

In this section we present an application of the results and framework presented in the 
previous sections, that will allow us to verify the developed theory and will also asses the 
performance of the different preconditioners. 

The application we consider comes from a nonsymmetric Discontinuous Galerkin dis- 
cretization of an elliptic problem. In [2J [3], additive and multiplicative Schwarz precon- 
ditioners were developed for the solution of the above algebraic system. In both works, 
the authors show that the GMRES convergence theory cannot be applied for explaining 
the observed convergence since the preconditioned system does not satisfy the sufficient 
conditions for such theory. Here we aim at comparing the performance of the different 
preconditioners introduced in the previous sections, for such discretizations. 

More precisely, we consider the following model problem in Q = [0, l] 2 : 

—Au* = f in Q, u = on dQ , 

where the right hand side / is chosen so the exact solution is u* = sin(7rx) sin(7ry), and we 
focus on the Incomplete Interior Penalty Discontinuous Galerkin (IIPG) [5] discretization 
of the above model problem, with linear discontinuous finite element space (denoted by 
V DG ) on a shape regular triangulation of Q, denoted by Th- The resulting method reads: 
Find u G V DG such that 

a h (u,v) = [ fvdx V v G V DG . 
Jq 

The bilinear form of the IIPG method is given by 




for all u, v G V DG . Here, K G Th refer to an element of the triangulation, e C dK denotes 
an edge of the element and we have denote by Eh the set of all such edges or skeleton of the 
partition Th- We have used the standard definition of the average and jump operators from 
[I], and the penalty parameter rj is set to 5 in all the experiments. We denote by Ah the 
matrix representation of the operator associated to the bilinear form (15.11) . with standard 
Lagrange basis functions. Similarly, u and f denote the vector representations (in the same 
basis) of the solution (that we aim to compute) and the right hand side. In the end, the 
solution process amount to solve the nonsymmetric system: 

(5.2) A h u = f. 

The preconditioners we use are based on the standard two-level overlapping domain 
decomposition additive Schwarz preconditioner which we denote by B~ x . To define it, we 
consider an overlapping partition of Q into rectangular subdomains Qk which overlap each 
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other by an amount equal to the fine discretization size h. Then 

(5.3) B- 1 = I H A- H l I T H + hA- k x ll 

k=l 

Here the A k operators are restrictions of the original operator Ah to the finite element space 
V k that is only supported on Q k , that is they correspond to the bilinear forms, 

a k (u, v) = a h (u, v), Vtt, v G V k , 

as in [TTJ [7]. Since V k C Vh, the operators I k are standard injection. The operator Ah 
corresponds to the bilinear form (15.11) on a coarser discretization of the original domain 
Q, where we label the coarse discretization size H. We assume that the fine mesh is a 
refinement of the coarse mesh used to represent Ah so that Ijj is the natural injection on 
nested grids. The penalty parameter on the coarse grid is taken to be 5H/h in order to 
account for the difference of scales in the edge lengths in the penalty terms (see [21 [EE] , for 
further details). We implement these preconditioners on a parallel machine so that each 
subdomain is assigned to a processor and the subdomain solves can be done in parallel. 
Another preconditioner we consider is 

(5.4) Z- 1 = B- T A B-\ 

as outlined in the analysis above. Here A is a symmetric operator corresponding to the 
bilinear form 




where the penalty parameter r] = 5 is the same as it is in the full bilinear form (15. ip . 

In what follows we consider four different preconditioning techniques for the nonsym- 
metric system (15.21) . namely 

(1) the standard additive Schwarz preconditioner B~ l used in a right-preconditioned 
GMRES algorithm, for comparison with the other options, 

(2) the preconditioner Z~ l = B~ T AqB~ 1 again used in a right-preconditioned GMRES, 

(3) the flexible two-preconditioner GMRES variant with the two preconditioners B^ 1 
and 

(4) and the two-preconditioner GMRES variant with B~ l and B~ T as the two precon- 
ditioners. 

We note that in the third case, if we have applied B^ 1 to a vector u we can construct Z _1 u 
by applying B~ T A to save ourselves one preconditioner application. 

The number of GMRES iterations necessary to reduce the relative residual by 10~ 6 for 
our four different preconditioning approaches using various numbers of subdomains N s for 
a fixed problem is given in Table [TJ Here we solve the coarse problem involving A H X to 
a tolerance of 10 -10 so that this solve is nearly exact in order to satisfy the theory more 
closely. The preconditioning techniques are seen to be scalable in the sense that the number 
of iterations does not increase as N s increases for all four methods. Similarly, we show the 
convergence rates in Table HI where the convergence rate is defined as the factor by which 
the true residual is reduced in the last iteration. To get an idea of computational cost, 
we show the time to solution in seconds for the four approaches in Table [3j We conclude 
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Table 1. Iterations to convergence for h = 2 7 ,H = 2 5 with a nearly exact 
coarse solver and no restart. 



N s 


B- 1 and Z' 1 


B~ l and B~ T 


Z" 1 


B- 1 


4 


15 


16 


38 


16 


8 


15 


18 


42 


18 


16 


16 


18 


42 


18 


32 


16 


18 


44 


18 


64 


15 


17 


43 


17 


128 


16 


18 


46 


18 



Table 2. Convergence rate for h = 2 7 ,H = 2 5 with a nearly exact coarse 
solver and no restart. 



N s 


B- 1 and Z~ l 


B~ l and B~ T 


z- 1 


B~ l 


4 


0.29 


0.38 


0.65 


0.32 


8 


0.26 


0.41 


0.71 


0.47 


16 


0.34 


0.41 


0.61 


0.40 


32 


0.41 


0.40 


0.71 


0.41 


64 


0.29 


0.47 


0.64 


0.40 


128 


0.28 


0.51 


0.71 


0.50 



Table 3. Time to solution for h = 2 7 ,H = 2 5 with a nearly exact coarse 
solver and no restart. 



N s 


B- 1 and Z~ l 


B~ l and B~ T 


Z" 1 


B- 1 


4 


1.74 


1.79 


3.12 


0.84 


8 


1.39 


1.57 


3.22 


0.74 


16 


0.70 


0.75 


1.58 


0.34 


32 


1.66 


1.83 


4.63 


0.94 


64 


1.44 


1.66 


4.65 


0.88 


128 


2.70 


2.98 


8.59 


1.76 



that the Z~ x preconditioner is not competitive because it is the most expensive in terms 
of time per iteration and it also requires the most iterations. For this reason we do not 
consider it further in these numerical results. The two preconditioning techniques that use 
the two-preconditioner GMRES variant are seen to be effective in convergence rate but to 
be somewhat more expensive than the classical B~ l preconditioner, as we might expect. 

In practice for parallel computing applications the coarse solve in (15. 3 j) would not be 
done exactly. Another modification that is often made in practice is to restart GMRES 
after several iterations. In Tables HJ El and we repeat the previous experiment where 
the relative residual tolerance for the coarse solves is set to 10~ 4 and GMRES is restarted 
every 10 iterations. (These tables should be compared to Tables (H 121 and [3] respectively.) 
We see that the convergence behavior is quite similar and the computational cost is lower, 
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Table 4. Iterations to convergence for h = 2 7 ,H = 2 5 with an inexact 
coarse solver and restarting every 10 iterations. 



N s 


B- 1 and Z~ l 


B~ l and B~ T 


B~ l 


4 


15 


18 


21 


8 


16 


18 


20 


16 


16 


18 


20 


32 


16 


18 


19 


64 


16 


18 


19 


128 


16 


18 


19 



Table 5. Convergence rate for h = 2 7 ,H = 2 5 with an inexact coarse 
solver and restarting every 10 iterations. 



N s 


B- 1 and Z~ l 


B~ l and B~ T 


B~ l 


4 


0.19 


0.40 


0.49 


8 


0.36 


0.42 


0.45 


16 


0.34 


0.38 


0.44 


32 


0.32 


0.39 


0.44 


64 


0.23 


0.32 


0.39 


128 


0.25 


0.42 


0.34 



Table 6. Time to solution for h = 2 7 ,H = 2 5 with an inexact coarse 
solver and restarting every 10 iterations. 



N s 


B- 1 and Z~ l 


B~ l and B~ T 


B~ l 


4 


1.59 


1.73 


0.99 


8 


1.29 


1.35 


0.72 


16 


0.59 


0.62 


0.35 


32 


0.73 


0.79 


0.46 


64 


0.67 


0.71 


0.44 


128 


0.89 


0.95 


0.60 



suggesting that these common modifications are also useful and effective for our precondi- 
tioning strategies. 

To see how these methods scale to larger problems, we consider a much finer mesh in 
Tables [3 El and[9l while keeping the mesh size for the coarse solve quite coarse. The scal- 
ability of the preconditioners in terms of iterations is still present, the preconditioner still 
performs well, and in these cases we can see fairly good parallel scalability in the sense that 
for a fixed problem size, doubling the number of processors in the parallel solve roughly 
cuts the execution time in half for all our preconditioning strategies. 

In all the results we have presented so far, the flexible two-preconditioner GMRES vari- 
ant has performed slightly better than the classical B~ x preconditioner in terms of number 
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Table 7. Iterations to convergence for h = 2 -10 , H = 2~ 6 with an inexact 
coarse solver and restarting every 10 iterations. 



N s 


B- 1 and Z~ l 


B~ l and B~ T 


B~ l 


32 


30 


32 


31 


64 


30 


31 


32 


128 


30 


31 


31 


256 


30 


31 


30 



Table 8. Convergence rate for h = 2 10 ,H = 2 6 with an inexact coarse 
solver and restarting every 10 iterations. 



N s 


B- 1 and Z~ l 


B~ l and B~ T 


B~ l 


32 


0.49 


0.67 


0.64 


64 


0.56 


0.68 


0.66 


128 


0.52 


0.64 


0.66 


256 


0.51 


0.68 


0.48 



Table 9. Time to solution for h = 2~ 10 ,if = 2" 6 with an inexact coarse 
solver and restarting every 10 iterations. 



N s 


B- 1 and Z~ l 


B- 1 and B~ T 


B~ l 


32 


94.32 


95.55 


40.96 


64 


37.89 


37.59 


16.66 


128 


15.55 


15.60 


7.82 


256 


7.76 


8.81 


5.50 



Table 10. Iterations to convergence for h = 2 10 ,H = 2 9 with an inexact 
coarse solver and restarting every 10 iterations. 



N s 


B~ l and Z~ l 


B~ l and B~ T 


B~ l 


32 


19 


20 


23 


64 


17 


18 


22 


128 


15 


17 


21 


256 


15 


16 


20 



of iterations to convergence, but the increased computational cost per iteration of this GM- 
RES variant has ended up making the classical preconditioner perform better in execution 
time. This suggests that the new method may be competitive in settings where each itera- 
tion is very expensive, so that the savings in iteration count can make up for the increased 
cost per iteration. To examine this setting we consider a problem in Tables [TU1 and [TT1 where 
the coarse grid solve is done on a relatively fine grid and is therefore quite expensive. The 
results in this somewhat artificial setting do show that the new methods are competitive 
with the classical preconditioning techniques in terms of computational cost. 
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Table 11. Time to solution for h = 2~ W ,H = 2" 9 with an inexact coarse 
solver and restarting every 10 iterations. 



N s 


B- 1 and Z~ l 


B~ l and B~ T 


B~ l 


32 


277.50 


280.69 


352.85 


64 


170.94 


187.29 


133.73 


128 


79.01 


93.61 


99.50 


256 


40.92 


45.41 


39.59 
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